AVOIDANCE OF CACHE SYNONYMS

The present invention relates to cache memories and
methods of writing data such as instructions into a cache
memory of a computer system.

BACKGROUND OF THE INVENTION

To run a computer program, a computer repetitively
carries out a sequence of functions, typically fetching an
instruction held at a given address, decoding the instruction,
accessing an operand for use by the instruction, executing
the instruction, storing the result of the execution and
determining the next instruction address. Problems can
occur where an instruction contains a test whose result
determines the address of the next instruction to be executed.
An instruction of this type is known as a conditional jump.
The consequence of the presence of a conditional jump
instruction is that the instruction typically has to pass
through several pipeline stages in a processor before the test
is resolved, and before the next instruction to be fetched can
be determined with certainty. This can delay the pipeline
process.

A program sequence may also include non-conditional
jump instructions. Such instructions, if executed, result in
the program sequence jumping to a new instruction.

For the purposes of this specification, the term "branch
instruction" will be used to include both conditional and
non-conditional jump instructions.

In this specification the term instruction includes primitive
operations which may be included in a VLIW system
using Very Long Instruction Words. An instruction word or
sequence may therefore comprise a VLIW instruction.

To reduce delay in determining the next fetch address,
branch instructions may be loaded into a cache memory,
such as a branch target buffer, to try to predict a new
instruction target address.

It is an object of the present invention to avoid one
addressable entry, such as a branch instruction, being
entered on two or more lines of the same cache.

SUMMARY OF THE INVENTION

The invention provides a method of loading entries into a
cache memory comprising a content addressable (CAM)
portion having a plurality of lines coupled to a respective
line of a data memory portion, which method is characterised
by effecting in parallel an associate operation and a
write operation with the same address data and using the
result of the associate operation to control validation of the
write operation and thereby prevent a single address having
two entries in the same cache. Preferably the method
includes selectively controlling validity of one or more lines
in the cache, invalidating one line of the cache, writing data
including an address to said one line at the same time as an
associate operation is carried out using the same address,
and controlling re-validation of said one line in dependence
on the result of the associate operation, thereby validating
said line only if no cache hit is found.

In one embodiment, in response to a cache hit in one CAM
line, data stored in that line is modified by said write
operation. Preferably an associate input of said CAM is
connected to selection circuitry arranged to receive both
write data and associate data, said method including
operating said selection circuitry when a write operation is
effected to select said write data for the associate input.

The invention includes a method of operating a branch
target buffer, comprising a cache memory with a CAM
portion having a plurality of lines coupled to a respective
line of a data memory portion, to prevent a single branch
instruction being stored as a multiple entry in said buffer,
which method comprises selectively controlling validity of
one or more lines in the cache, invalidating one line of the
cache, writing data including an address to said one line at
the same time as an associate operation is carried out using
the same address, and controlling re-validation of said one
line in dependence on the result of the associate operation,
thereby validating said line only if no cache hit is found. In
one embodiment said CAM is used to store address data
relating to a plurality of branch instructions and said data
memory portion is used to store, for each branch instruction
in the cache, a target address for the branch instruction
and a prediction of whether the branch will be taken when
executed by a processor, said method effecting modification
of at least one of [...] in the cache during said write
operation. Preferably said memory portion also holds a
prediction strength value in addition to said prediction, and
said write operation is effective to modify said prediction
strength value in the event of a cache hit during the write
operation.

The invention includes a cache memory comprising a
content addressable memory (CAM) portion having a
plurality of lines coupled to a respective line of a data memory
portion, said CAM portion having an associate input for
inputting data to effect an associate operation and a write
input for inputting data to effect a write operation, said
associate input being connected to input selection circuitry
to select either write data or associate data for connection to
said associate input whereby an associate operation may be
effected in parallel with a write operation using common
address data, said memory further comprising line validation
circuitry operable in response to a result of an associate
operation to prevent data related to said common address
being held in more than one line of said cache.

The invention includes a branch target buffer including a
cache memory as aforesaid in which said cache includes
data modification circuitry operable in response to a write
operation to modify data stored for a branch instruction
located by a cache hit during a write operation. Preferably
said memory portion includes storage circuitry for
storing a prediction strength value in addition to said
prediction for a branch instruction held in said cache, and
said storage circuitry is responsive to said data modification
circuitry to modify said prediction strength value in response
to a write operation relating to a branch instruction in said
buffer.

The invention includes a computer system comprising
a memory for holding a plurality of instructions, a
processor for executing a plurality of instructions
sequentially, and a branch target buffer as aforesaid for predicting
a target address for a branch instruction fetched from said
memory for execution by said processor.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a simplified example of a series of
instructions;

FIG. 2 shows a partial schematic diagram of circuitry for
implementing the method of this invention;

FIG. 3 shows a partial block diagram of an embodiment
of a branch target buffer for implementing the invention;

FIG. 4 shows an illustrative circuit diagram of read
circuitry of a partition of the branch target buffer of FIG. 3;

FIG. 5 shows the content of CAM cells and data RAMs
of the branch target buffer of FIG. 3 when handling the 
instructions of FIG. 1; 

FIG. 6 shows a first example of updating branch
prediction values for an instruction;

FIG. 7 shows a second example of updating branch 
prediction values for an instruction; 

FIG. 8 shows write circuitry for one line of a partition of
the branch target buffer of FIG. 3 connected to part of a
computer system; and

FIG. 9 shows a computer system including the branch 
target buffer of FIG. 3. 
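
By way of illustration only, the write method set out in the summary (write to an invalidated line while associating on the same address, and re-validate that line only if no hit is found) might be modelled in software as below. This is a minimal sketch and not the circuitry of the embodiment: the names `Line`, `Cache`, `associate` and `write`, the round-robin choice of the next line to invalidate, and the increment of `strength` on a hit are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Line:
    valid: bool = False
    address: Optional[int] = None   # CAM portion
    target: Optional[int] = None    # data memory portion
    strength: int = 0               # illustrative prediction strength


class Cache:
    def __init__(self, n_lines: int) -> None:
        self.lines: List[Line] = [Line() for _ in range(n_lines)]
        self.write_line = 0  # the one line kept invalid for the next write

    def associate(self, address: int) -> Optional[int]:
        """Return the index of the valid line holding `address`, if any."""
        for i, line in enumerate(self.lines):
            if line.valid and line.address == address:
                return i
        return None

    def write(self, address: int, target: int) -> bool:
        """Write an entry; return True only if a new line was validated."""
        slot = self.lines[self.write_line]
        slot.valid = False  # invalidated, so it cannot match itself
        slot.address, slot.target, slot.strength = address, target, 0
        # In hardware the associate runs in parallel with the write;
        # here it simply follows it, using the same address.
        hit = self.associate(address)
        if hit is not None:
            # Entry already held: modify it, leave the written line invalid.
            self.lines[hit].strength += 1  # assumed form of modification
            return False
        slot.valid = True  # no hit, so the written line is re-validated
        # Assumed replacement policy: invalidate the next line round-robin.
        self.write_line = (self.write_line + 1) % len(self.lines)
        self.lines[self.write_line].valid = False
        return True
```

Writing the same address twice in this model leaves exactly one valid line for that address; the second write only modifies the existing entry.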

DESCRIPTION OF THE PREFERRED 
EMBODIMENT 

FIG. 1 shows a series of twenty one-byte instructions
0-19 in a computer program. As seen in the figure, each
instruction is represented by an instruction number (0-19)
and by an address indicative of a storage location at which
the instruction is stored and which is represented by the
binary equivalent of the instruction number. The instructions
are illustratively grouped into sequences of four instructions
101, 102, 103, 104, 105. Each of these groups is termed an
instruction word. Each instruction word has a single word
address, that of instruction word 101 being "000", that of
instruction word 102 being "001", and so on. Each
instruction has a byte address indicating its position within the
word, so that the first instruction of each word has byte
address "00", the second "01" and so on. The majority of the
instructions, specifically instructions 0-4, 6, 8, 10, 11, 13
and 15-19 lead directly onto the next sequential instruction;
these instructions are marked 'X'. Instruction 12 (marked
'j') is termed an unconditional jump, because if execution
reaches instruction 12 the result is to jump to instruction 16.
Instructions 5, 7, 9 and 14 (marked 'cj') are each termed
conditional jumps. The next instruction to be executed after
a conditional jump typically depends upon a condition
tested, and may for instance be determined by whether the
value of the operand, stored as a result of the instruction
which immediately precedes the conditional jump in time,
exceeds a given value. The instruction to which the jump
will occur if the condition is met is indicated in the figure.
Thus instruction 5 will either be succeeded in time by
instruction 6, or by instruction 14 depending on the value
returned by instruction 4. Instruction 7 will be succeeded by
instruction 8 or by instruction 0 depending upon the value
returned by instruction 6. Instruction 9 will be succeeded by
instruction 13 or by instruction 10 depending upon the value
returned by instruction 8. Instruction 14 will be succeeded
either by instruction 15, or by instruction 2 depending upon
the value returned by instruction 13.
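
The addressing scheme of FIG. 1 can be illustrated with a short sketch. It assumes, as the description suggests, a five-bit instruction address whose upper three bits form the word address and whose lower two bits form the byte address; the function names `split_address` and `word_reference` are hypothetical and serve only this example.

```python
from typing import Tuple

WORD_SIZE = 4  # instructions per instruction word in the FIG. 1 example


def split_address(instruction_number: int) -> Tuple[str, str]:
    """Return (word address, byte address) as bit strings."""
    # Twenty instructions fit in a five-bit binary address.
    address = format(instruction_number, "05b")
    return address[:3], address[3:]


def word_reference(instruction_number: int) -> int:
    """Return the reference numeral (101-105) of the containing word."""
    return 101 + instruction_number // WORD_SIZE
```

For instance, instruction 5 has address "00101", giving word address "001" (instruction word 102) and byte address "01", i.e. the second instruction of that word.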

In some computers, the instructions will be fetched from
store individually, with instruction 0 fetched first, and then
instruction 1 fetched, and so on. However, in certain
computers, it has been proposed to associate plural
instructions together in sequences called instruction words, so that
fetching from an addressable location makes available the
plural instructions of the word for execution. In the present
description, the term instruction word is not intended to be
limitative and is used for convenience to refer to a plurality
of instructions, or one VLIW instruction, which may be
fetched by accessing a single address in store.

In the example of instruction words shown in FIG. 1, the
first instruction word 101 contains no branch instructions, i.e.
no conditional jumps and no unconditional jumps. The
second instruction word 102 contains two conditional jump
instructions, instructions 5 and 7. The third instruction word
103 contains one conditional jump instruction, instruction 9.
The fourth instruction word 104 contains one unconditional
jump instruction, instruction 12, and one conditional jump
instruction, instruction 14. The fifth instruction word
contains no jump or conditional jump instructions.

It will be appreciated that the program flow due to
executing the first instruction word 101 will always be the
same, namely executing the first instruction 4 of second
instruction word 102. Executing the second instruction word
102 causes three possible program flows, namely executing
the first instruction 8 of third instruction word 103 (if no
conditional jump is effected), executing the second
instruction 13 of the fourth instruction word 104 (if the conditional
jump at instruction 5 is effected) or executing the first
instruction 0 of first instruction word 101 (if the conditional
jump at instruction 7 is effected). Similarly, the outcome of
executing the third instruction word 103 is either to execute
the first instruction 12 of fourth instruction word 104 (if the
conditional jump at instruction 9 is not effected) or to execute
the second instruction 13 of fourth instruction word 104 (if
conditional jump instruction 9 is effected). Finally, the
outcome of executing fourth instruction word 104 is either
executing instruction 16, which occurs if the first instruction
12 of instruction word 104 has been executed or if the
conditional jump instruction 14 is not effected, or the
alternative outcome is a return to instruction 2 of first
instruction word 101.

For optimum speed of operation, the computer, having
fetched a given instruction word for execution, should next
fetch the correct instruction word, in the sense of the
instruction word containing the next instruction which is
required to be executed. However, it will be seen from the
foregoing that the identity of the next instruction to be
executed may vary if the current instruction word contains
any branch instruction. Take for example fourth instruction
word 104:

Instruction word 104 may be fetched either in response to
executing instruction 9 or instruction 11 of third instruction
word 103 or to executing instruction 5 of the second instruction
word 102. The next correct instruction word to fetch after
word 104 is either first instruction word 101 or fifth
instruction word 105. By examining fourth instruction word 104,
the reader will note that if instruction 11 of third instruction
word 103 were executed, then the first instruction 12 of
fourth instruction word 104 would be executed next and the
result of that would be that fifth instruction word 105 is the
next word to fetch. If however fourth instruction word 104
were fetched as a result of the conditional jump instruction
9 or the conditional jump instruction 5, the identity of the
next correct instruction word to fetch depends upon the
outcome of the conditional jump in instruction 14.

Continuing to consider the fourth instruction word 104,
fetching a wrong word (for example, first instruction word
101 being fetched instead of fifth instruction word 105)
would result in the processing pipeline containing a number
of wrong instructions which would of course not be
executed. The practical consequence of this would be that a
number of cycles would be lost, during which no useful
execution took place; only on a subsequent correct cycle
would the instruction word 105 be executed, after being
called up into the pipeline.

A simplified device illustrating some of the features of the
present invention will now be described with respect to FIG.
2.

Referring to FIG. 2, the device comprises memory
circuitry 200 which consists of a memory array 205 having a
plurality of addressable locations and an address decoder
210 for addressing the memory array 205. The address
decoder 210 has an input 201 over which it receives
instruction addresses within a single instruction word. The address
decoder produces address outputs corresponding to the input
instruction address and all later instruction addresses up to
the end of the relevant instruction word. The number n of
storage locations in the memory array 205 is at least equal
to the number of instructions in the program sequence
currently being run. The memory locations of the array 205
store, at the address of respective instructions, information
showing whether the instruction is a branch instruction or
not, i.e. a logical 0 where the instruction is not a branch
instruction, and a logical 1 if the instruction is a branch
instruction. Each memory location has a respective output
line 220₁-220ₙ. The output lines 220₁-220ₙ are coupled to
a register arrangement 250, having storage locations
corresponding to each of the lines 220₁-220ₙ, for storage of
branch prediction values. Each register location has a
respective input formed by one of the lines 220₁-220ₙ and
a respective output line 280₁-280ₙ. The register locations
store logical 1 where an associated branch instruction is
predicted as taken, and a logical 0 where an associated
branch instruction is predicted as not taken. When an
instruction address is input over input lines 201, the address
decoder 210 addresses the corresponding locations of the
array 205 and, where an instruction at a corresponding
address is a branch instruction, there will be a logical 1
output on the corresponding one of the output lines
220₁-220ₙ which in turn accesses the register 250 and
produces on an output line 280 either a corresponding
logical 1 if the branch is predicted as taken, or a
corresponding logical 0 if the branch is predicted as not taken. Where
one or more of the addressed locations is not a branch
instruction, logical zeros will be output over the
corresponding output lines 220₁-220ₙ.

The logic state stored in register locations corresponding
to non-branch instructions is not significant because no
logical 1 can occur on a line 220₁-220ₙ unless the associated
instruction is a branch instruction. As a result the output for
each location which corresponds to a non-branch instruction
will always be logical 0. The output lines 280₁-280ₙ of the
register 250 form word lines to a store 300 which stores
target addresses of branch instructions. The store 300 has a
first address output 301 at which the store 300 delivers the
target address of a branch instruction, and a second output
302 which provides a logical 1 when any branch instruction
is predicted as taken. The store 300 has one row 300₁-300ₙ
for each word line 280₁-280ₙ, and each row contains
memory cells connected to the output lines 301 so that
application of a logical 1 to one of the word lines 280₁-280ₙ
produces an address on output lines 301. Store circuitry 300
also contains gating circuitry having an output to the second
output line 302 and producing a logical 1 at output line 302
when a branch is predicted as taken. The store circuitry 300
further contains decision circuitry which provides only the
target address of the first jump instruction from the word
which is predicted as taken.
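
The behaviour of the simplified device of FIG. 2 may be easier to follow from a small software model. The sketch below is an illustration under stated assumptions rather than a description of the circuitry: the class name `SimplifiedPredictor` and its methods are hypothetical, a word size of four is taken from FIG. 1, and the outputs 301 and 302 are modelled as a returned pair of integers.

```python
from typing import List, Tuple

WORD_SIZE = 4  # instructions per instruction word, as in FIG. 1


class SimplifiedPredictor:
    def __init__(self, n: int) -> None:
        self.is_branch: List[int] = [0] * n        # memory array 205
        self.predicted_taken: List[int] = [0] * n  # register 250
        self.target: List[int] = [0] * n           # rows of store 300

    def initialise(self, address: int, taken: int, target: int) -> None:
        """Load one branch instruction during the initialising stage."""
        self.is_branch[address] = 1
        self.predicted_taken[address] = taken
        self.target[address] = target

    def lookup(self, address: int) -> Tuple[int, int]:
        """Return (output 301, output 302) for a fetch at `address`."""
        # Decoder 210: the input address and all later addresses in the word.
        word_end = (address // WORD_SIZE + 1) * WORD_SIZE
        for a in range(address, word_end):
            line_220 = self.is_branch[a]
            line_280 = line_220 & self.predicted_taken[a]
            if line_280:
                # Decision circuitry: first predicted-taken branch wins.
                return self.target[a], 1
        return 0, 0  # no predicted-taken branch in the rest of the word
```

Loaded with the branches of FIG. 1 and an arbitrary set of predictions, a lookup at the start of instruction word 102 returns the target of instruction 7 when instruction 5 is predicted not taken and instruction 7 is predicted taken.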

Before operating the device of FIG. 2, it is initialised by
sequentially addressing the memory array 205 and storing a
logical 1 at the address locations of memory array 205 which
correspond to branch instructions. The register circuitry 250
is loaded, during the initialising stage, with prediction
information indicating whether or not the associated branch
instructions are predicted taken. The store circuitry 300 is
loaded with address information corresponding to the target
addresses of the branch instructions stored in memory array
205. The circuitry necessary for effecting the initialisation of
the circuitry of FIG. 2 is not shown.

Once the initialisation has been effected, the program
sequence can be run. To do this, the address of the first
instruction, instruction 0 of first instruction word 101, is fed
to input 201 to memory circuitry 200. This causes
addressing of the memory array 205 at all locations corresponding
to instructions in the first word. Because the first instruction
word 101 contains no jump instructions, no logical 1 will be
output. Hence the output lines 220₁-220ₙ will all carry
logical zero, and these logical zeros are applied to the
register 250. The result of applying all logical zeros to the
inputs of the register store 250 is to provide on output lines
280₁-280ₙ outputs comprising all logical zeros. The store
circuitry 300 receives the logical zeros and provides an
output of logical zero at the first output 301 and an output of
logical zero at the second output 302.

In the present example, the first instruction word 101 
contains no branches and thus no target address is output at 
output 302. Accordingly execution proceeds with the fetch- 
ing of the second word 102. 

Accordingly the next instruction word to be fetched is the
second instruction word 102. For the second instruction
word 102, the memory circuitry stored a logical 1 at the
address corresponding to instruction 5, at byte position 2 (the
byte positions in each word being designated 1 to 4) of that
instruction word, and a logical 1 at the address corresponding
to instruction 7, at byte position 4 of that instruction word.
Logical one is output from the memory circuitry over the
output line 220₁-220ₙ corresponding to the second position
and over the output line 220₁-220ₙ corresponding to the
fourth position in the second instruction word 102. These
logical one inputs are provided to the register 250 which in
turn provides logical one outputs over those register output
lines 280₁-280ₙ which correspond to branch instructions
which are predicted to be taken. Thus, the one of the
register output lines 280₁-280ₙ corresponding to the second
position of the second instruction word 102 will carry a
logical one if it is predicted that the conditional jump
instruction 5 will be taken, and the line corresponding to the
fourth instruction position of second instruction word 102
will carry a logical one if it is predicted that the conditional
jump instruction 7 is effected.

These logical one inputs are provided to the store circuitry
300 so as to read the target addresses of the predicted-taken
branches, and the decision circuitry outputs the target address
of the first occurring predicted-taken branch instruction.
This is because if, for example, conditional jump instruction
5 were predicted as taken, instructions 6 and 7 cannot be
executed if the prediction is correct. Thus the earliest
occurring predicted-taken branch in an instruction word
determines the next instruction to be fetched.
The target address is used by the processor of the
computer to cause a new instruction word to be fetched
containing the instruction at which execution is predicted to
proceed.

In this simplified embodiment as mentioned previously,
where execution of the instructions in an instruction word
starts other than at the first instruction of that word, the
address of the initial instruction is input over inputs 201 and
the address decoder 210 only applies addresses
corresponding to the remaining instructions of the word to the memory
array 205. Thus, for the example described with respect to FIG.
1, if execution of the fourth instruction word 104 were to
commence at instruction number 13 (for example in
response to the conditional jump at instruction 9), the
address corresponding to instruction 13 would be input over
input lines 201, and address circuitry 210 would then apply
the addresses of instructions 13, 14 and 15 to the memory
array 205. As a result, the branch instruction at instruction 12
(which is an unconditional jump, and is always therefore
predicted as taken) would not be presented on one of the
memory array output lines 220₁-220ₙ.

The above description is intended to illustrate the
concepts of this invention. One embodiment of circuitry for
carrying out the method of the present invention, this
circuitry including a branch target buffer, will now be
described with reference to FIGS. 3-9.

Referring firstly to FIG. 3, a branch target buffer 400
consists of four generally similar sections 401-404, each
referred to hereinafter as a partition.

The number of partitions of the present embodiment
corresponds to the number of instructions per instruction
word, and in the example described above with reference to
FIG. 1, this number is 4. It will of course be understood that
other numbers could be used according to the content of the
instruction word; specifically in a more complex system 8
partitions could be used. It will also be appreciated by one
skilled in the art that fewer partitions could be provided. For
example, it would be possible for the four-byte example of
FIG. 1 to only provide two partitions, one corresponding to
the first two bytes of each instruction word, and the other
corresponding to the second two bytes of each instruction
word.

In the present example, first partition 401 stores data
relevant to branch instructions which are located at the first
byte of a word stored in the buffer, second partition 402
stores data relevant to the branch instructions at second byte
locations and so on. First partition 401 is referred to herein
as the lowest partition, and fourth partition 404 as the
highest partition.

Each partition of the buffer 400 includes a CAM array
holding a plurality of word addresses and associated data
RAMs holding target addresses for branch instructions
located at the word addresses stored in the CAMs. The buffer
has first and second input buses 500, 501 giving the address
of an instruction being fetched. The first input bus 500 is an
instruction word address bus receiving the most significant
bits of the instruction address and the second input bus 501
is an instruction byte address bus for the least significant bits
of the address of the instruction being fetched. In the
example of FIG. 1, the first three bits of the instruction
address form the instruction word address and the lowest
two bits form the instruction byte address. An input on bus
500 therefore enables an associate operation to determine if
the buffer holds data corresponding to the word address of
the instruction being fetched. The input 501 is used to
control the output of the CAM arrays during a read operation
so as to form a masking operation which prevents output
from any partition CAM corresponding to a byte location
within the word before the byte position input on line 501
(that is, when the input on 501 is greater than the partition
number). If a read operation in the buffer 400 has a hit for
the relevant instruction identified by the inputs 500 and 501,
the buffer provides a target address output on bus 507, an
output 505, being a bus carrying information indicating
which partitions contain branch instructions from the current
instruction word which are predicted taken, and an output
providing information on "prediction strength"
obtained from each partition.

It will be understood that when a processor outputs an
address to memory to fetch a new instruction, that address
is simultaneously fed to the buffer 400 to carry out a read
operation, and if the processor executes a branch instruction
which is not in the buffer, data relating to that branch
instruction will be written into the buffer after execution by
the processor so that it is ready for use at some later time.
To write data into the buffer a word address input bus 900
is provided to write word addresses into the CAM arrays, and a
target address input 903 is provided to write into the data
RAMs target addresses for the branch instructions which are
newly written into the buffer. As will be later described, any
write operation into the buffer is carried out simultaneously
with a read operation using the same word address and input
900 as a read input 500 together with the appropriate byte
address 501 to avoid writing into the buffer any instruction
which is already held in the buffer. Other inputs and outputs
shown in FIG. 3 and their function will be described with
reference to the more detailed drawings of FIGS. 4, 8 and 9.

As will be seen with reference to FIG. 3, each of the
partitions 401-404 is substantially similar and therefore a
detailed description of one exemplary partition 401 only will
be given.

Referring to FIG. 4, partition 401 consists of n content
addressable memory cells 510₁-510ₙ coupled to the
instruction word address input bus 500, having a first plurality of
CAM output lines 511₁-511ₙ. As will be later described
herein, the present embodiment of the branch target buffer is
operated dynamically, in the sense that once all of the lines
except one contain data the partition is regarded as "full",
i.e. one unfilled line is retained for writing a new entry, and
as part of the write operation details of one branch
instruction stored in the branch target buffer are discarded so as to
be ready for input of a next newly-found branch instruction.
A branch instruction is said to be "newly-found" if it does
not exist in the branch target buffer at the time of testing for
presence of the branch in the buffer. It will be noted that a
branch instruction which was discarded from the branch
target buffer during one cycle of operation may become a
"newly-found" instruction during a later cycle of operation.
Thus, whereas increasing the number n of content
addressable memory cells increases the complexity and size of the
device, such an increase tends to reduce the number of
occasions on which a branch instruction needs to be written
in.

Decreasing the number n of memory cells provides a
smaller and simpler device, but with the penalty that the
chances of failing to find a branch instruction are increased.
Where a jump instruction is not found this tends to lead to
a processing delay.

In the presently described embodiment, content
addressable memory cells 510₁-510ₙ are capable of storing the
word addresses input over the instruction word address bus
500. As will be described later herein, certain instruction
word addresses are stored in the content addressable
memory cells. The instruction word address of the
instruction currently being fetched is input over the instruction
word address bus 500 and when an instruction word address
corresponding to one of the stored addresses is input a
logical 1 occurs on the corresponding one of the CAM
output lines 511₁-511ₙ.

Selection circuitry 512 has an input connected to the
instruction byte address bus 501 and is connected to the
CAM output lines 511₁-511ₙ so as to selectively disable all
of the lines of a partition. The selection circuitry 512 has a
processing circuit 560 receiving one input from the
instruction byte address bus 501 and having an output 561
connected to the control inputs of a plurality of pull down
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transistors 562 A -562 n . Each of the pull down transistors is 
connected between a respective one of the CAM output lines 
511 and earth. During reading, a control input 912 
sets the processing circuitry 560 to produce a high output so 
as to turn on all of the pull down transistors 562^562,, in 
response to the byte address input over the instruction byte 
address bus 501 which is greater than the number of the 
partition! In this way the circuits 512-812 act to mask out 
these partitions where the stored branch instruction has a 
byte location before the byte position indicated on line 501. 
The selection circuitry 512 has output lines 513₁-513ₙ
which form inputs to prediction processing circuitry 514.
The prediction processing circuitry 514 includes an
n-location store 563 for storing prediction information for
indicating whether an associated branch instruction is
predicted taken or not taken. Each of the store locations
563₁-563ₙ corresponds to a respective CAM cell, and has an
output connected to one input of a respective two input AND
gate 564₁-564ₙ, the other input of which is provided by a
respective output line 513₁-513ₙ of the selection circuitry
512. As shown in FIG. 8, an OR gate 565 is connected
between the respective store locations 563 and the respective
AND gate 564₁-564ₙ, but as this forms a direct connection
during reading it is omitted from FIG. 4 for clarity. As will
later be described herein, the prediction store 563 stores a
logical 1 in positions corresponding to branch instructions
which are predicted as taken, and a logical 0 in positions
corresponding to branch instructions which are predicted as
not taken. A line 907, described more fully with reference to
FIG. 8, allows the writing of new prediction information to
the prediction store. Each AND gate 564₁-564ₙ has a
respective output which is connected to a respective output
line 515₁-515ₙ of the prediction processing circuitry 514. As
shown in FIG. 8, there is an OR gate 905 between the output
of each AND gate 564₁-564ₙ and the output line 515₁-515ₙ,
but as this forms a direct connection during reading, it is
omitted from FIG. 4 for clarity. The output lines 515₁-515ₙ
form the word lines to a RAM 518 and also form inputs to
an n-input NOR gate 516. The NOR gate 516 is connected
to a single output line 517 which controls a first transmission
gate 531 and a second set of transmission gates 530. The
NOR gate output lines 517-817 from all of the partitions
together form the output bus 505. As shown in FIG. 8 the
output of NOR gate 516 is connected to OR gate 590 which
is omitted for clarity in FIG. 4 as it has no effect during a
read operation as the signal on line 920 is 0 during a read.
During a write, the gate 590 in each partition receives a
signal 1 on line 920.
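
As a rough illustration of the read path just described for a single
partition, the following Python sketch models the selection circuitry,
the prediction AND gates and the NOR gate as boolean operations. The
function name partition_read, the zero-based partition index and the
list representation of the lines are assumptions made for illustration
only; they do not appear in the description.

```python
from typing import List, Tuple


def partition_read(
    cam_hits: List[bool],      # CAM output lines 511, one per line of the partition
    predictions: List[bool],   # prediction store 563: True = predicted taken
    partition_index: int,      # hypothetical zero-based number of the partition
    byte_address: int,         # value on the instruction byte address bus 501
) -> Tuple[List[bool], bool]:
    """Return (word lines 515, NOR gate output 517) for one partition."""
    # Selection circuitry 512: when the byte address is greater than the
    # partition number, every CAM output line is pulled to earth.
    masked = byte_address > partition_index
    selected = [hit and not masked for hit in cam_hits]            # lines 513
    # AND gates 564: a word line is raised only for a predicted-taken hit.
    word_lines = [s and p for s, p in zip(selected, predictions)]  # lines 515
    # NOR gate 516: logical 1 only when no word line is raised.
    nor_output = not any(word_lines)
    return word_lines, nor_output
```

In this model a NOR output of False corresponds to the logical 0 on
line 517 that renders the transmission gates 530, 531 non-conductive.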

Data RAM 518 consists of n rows of storage cells, each
row of which is addressed by a respective one of the lines
515₁-515ₙ. Each row of storage cells stores the target
address of a branch instruction, i.e. the address at which
execution will continue if the branch is taken. This
information is made available on target address output bus 503,
which passes between the partitions, but which contains the
transmission gates 530 for isolating partitions storing data
relevant to later bytes in the same word as will be later
described. Bus 503 is connected to sense amplifier circuitry
506 having a target address output 507 and a target address
input 903 used during a write operation. The bus 503 forms
part of a common data path interconnecting the data RAMs
with respec[...]
and with transmission gates in the bit line paths between
adjacent data RAMs.

The branch target buffer also consists of prediction
strength processing circuitry 550.

The prediction strength processing circuitry 550 receives,
as first inputs, the output lines 513₁-513ₙ of the selection
circuitry 512 and produces an output on prediction strength
output line 504. The prediction strength processing circuitry
550 further receives as a control input the update strength
line 502, and also receives inputs from a line 909 for writing
new prediction strength data. This latter is described with
reference to FIG. 8.

Referring again to FIG. 3, and as mentioned above, the
remaining partitions have similar structures to that of the
first partition 401. The integers of each partition which are
similar to those of partition 401 have similar reference
numerals, the reference numerals of partition 402 being in
the range 600-699, those of partition 403 being in the range
700-799 and those of partition 404 being in the range
800-899.

As previously described, between each respective
partition and the adjacent partition there is a first transmission
gate 531, 631, 731, 831 and a second set of transmission
gates 530, 630, 730, 830. Both the first transmission gate and
the second set of transmission gates are controlled by the
output lines 517, 617, 717, 817 of the respective NOR gate
516, 616, 716, 816 of the associated partition. In any one
partition, the operation is such that when the respective NOR
gate 516, 616, 716, 816 receives a logical 1 at any one of its
inputs, the corresponding output line 517, 617, 717, 817
goes to logical 0, thus rendering the respective first
transmission gates 531, 631, 731, 831 and the respective second
set of transmission gates 530, 630, 730, 830 non-conductive.
This has the effect of interrupting the target address output
bus between the relevant partition and the next higher
partition. The result is that only the lowest partition in which
there is a logical 1 for the input of the NOR gate 516, 616,
716, 816 can supply a target address onto the target address
output bus 503. The output RAMs 518-818 are therefore
connected serially to a common path to the output 503 and
the control gates 530-830 in that path allocate a decreasing
priority to partitions progressively moving away from the
output 503. Any one partition can only provide an output if
no higher priority partition is providing an output. Once one
partition provides an output all lower priority partitions are
blocked and cannot provide an output.
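
The priority arrangement described above behaves like a simple
priority selection over the partitions. The following Python sketch is
a behavioural model only, not a description of the circuit; the
function name select_target and the per-partition lists are
hypothetical names introduced for illustration.

```python
from typing import List, Optional


def select_target(
    taken_hit: List[bool],   # per partition: True if any word line is raised
    targets: List[int],      # per partition: target address its data RAM would drive
) -> Optional[int]:
    """Return the target address reaching output bus 503, or None."""
    for hit, target in zip(taken_hit, targets):  # lowest partition first
        if hit:
            # The NOR gate of this partition goes to logical 0, so the
            # transmission gates towards all higher partitions become
            # non-conductive and no later partition can drive the bus.
            return target
    return None
```

With predicted-taken hits in the second and fourth partitions, only
the second partition's target address would reach the output in this
model.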

The operation of the circuitry of FIG. 3 will now be 
described: 

Data relating to branch instructions are stored in a
partition which corresponds to the byte position of the branch
instruction in its instruction word. For example, data relating
to a branch instruction in the first byte position of any word
is stored in first partition 401, data relating to a branch
instruction in the second byte position in the second partition
402 and so on. Hence for the example of FIG. 1, partition
401 is relevant to instructions 0, 4, 8, 12 and 16 (the first byte
address of each instruction word), the second partition 402
stores and processes information relevant to instructions 1,
5, 9, 13, 17 and so on. Data indicating the presence of branch
instructions is stored in the CAM cells of the partition, data
relating to the prediction as to whether the branch is taken
is stored in the prediction processing circuitry and data
relating to the target address of the branch instruction in the
RAM of the partition. In the presently described example,
the data which indicates the presence of branch instruction
is the full instruction word address of the word containing
the branch instruction; as will be later described herein, it is
possible to store only a part of the instruction word address.
The data chosen for indicating the presence of branch
instructions is stored in the content addressable memory
cells of the respective partition.
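
The mapping from an instruction address to a word address and a
partition can be sketched as follows, assuming the four-byte
instruction words of the FIG. 1 example. The names split_address and
partition_for are hypothetical and are used only for illustration.

```python
from typing import Tuple

BYTES_PER_WORD = 4  # four byte positions per instruction word, one per partition


def split_address(address: int) -> Tuple[int, int]:
    """Split an instruction address into (word address, byte address)."""
    return address // BYTES_PER_WORD, address % BYTES_PER_WORD


def partition_for(address: int) -> int:
    """Reference numeral of the partition serving this byte position."""
    partitions = (401, 402, 403, 404)
    return partitions[address % BYTES_PER_WORD]
```

For instance, instruction 12 (address 01100) splits into word address
011 and byte address 00, and so belongs to the first partition 401.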

The location of data identifying branch instructions is
shown in FIG. 5 for the FIG. 1 example. Referring to FIG.
5, it will be seen that cells 510 of the first partition 401
contain an entry of "011" because the instruction word 104
(having address "011") has an unconditional jump instruction
in the first byte position. Similarly the second and third
instruction words 102, 103 (word addresses "001" and
"010") have conditional jump instructions at the second byte
position of the instruction words, namely the positions
corresponding to instructions 5 and 9. Thus, the cells 610 of
second partition 402 store word addresses "001" and "010"
for words 102 and 103. There are also entries in the third
partition (for address "011" corresponding to instruction 14)
and in the fourth partition (corresponding to instruction 7).

Still referring to FIG. 5 in the context of the instructions
shown in FIG. 1, an input is made to the instruction word
address input bus 500 of the word address of the instruction
currently being fetched; this address is applied by the bus to
the CAM cells 510, 610, 710, 810 of the whole device.
Where a match occurs between the word address input to
the bus 500 and an address stored in a CAM cell, a logical
one output (referred to as a "CAM hit") is provided by the
relevant output line of the CAM. If for example the word
address "000", corresponding to word 101, is input, no
match occurs. If the word address of
instruction word 102 is input over bus 500, a match occurs
in the CAM cells 610 of the second partition (corresponding
to instruction 5) and in the CAM cells 810 of the fourth
partition (corresponding to instruction 7).

An input is made to the byte input address bus 501
comprising the instruction byte address, which represents
the position within the instruction word at which execution
is to commence. For the example shown in FIG. 5, assume
that execution of the program segment shown is to start at
the third instruction of second instruction word 102
(instruction 6 in FIG. 1). In this situation, the word address
input over bus 500 would be "001" corresponding to the
second instruction word 102. This would produce CAM hits
in the second and fourth partitions. Referring again to FIG.
1 it will be seen that the byte address for the third instruction
of each instruction word is "10". It is this address which is
input on instruction byte address bus 501.

The selection circuitry 512, 612, 712, 812 includes
respective processing circuitry 560 (see FIG. 4) which acts
during reading to disable any partitions which correspond to
bytes lower than the byte address input on instruction byte
address input bus 501. Thus, for an input byte address of
"00" no partitions will be disabled, and all selection circuits
will pass any hits from input 511, 611 etc to output 513, 613
etc. If the instruction byte address is "01" then selection
circuitry 512 of the first partition 401 will not pass any hit
from an input line 511₁-511ₙ to an output line 513₁-513ₙ,
whereas in other partitions, any hit will be passed from input
to output. If the instruction byte address were "10" then any
hit would only be passed by selection circuits 712 and 812
in the third and fourth partitions, and so on. In the present
case, the instruction byte address is "10", thus enabling only
selection circuitry 712 and 812 to pass any hit. However, the
only hit which occurs is in the fourth partition, and this is
therefore allowed to proceed as an input to the prediction
processing circuitry 814 of the fourth partition.

It will be recalled that the prediction processing circuitry
514, 614, 714 and 814 each consists of respective prediction
store 563 which contains a logical 1 or logical 0 indicating
whether a respective branch is predicted as being taken or
not, and plural AND gates 564. The AND gates receive one
input from the associated selector circuitry and one from a
respective entry of the register. Thus, considering instruction
No 7 of FIG. 1, the prediction processing circuitry 814 of the
fourth partition will either produce a logical 1 from a
respective AND gate on one of the output lines 815 (if the
branch instruction 7 is predicted as taken) or will produce all
logical 0's on the output lines 815 (if the branch instruction
7 is predicted as not taken).

For the first situation, namely that where the branch
instruction is predicted as taken, the presence of a logical 1
on the output line 815 which corresponds to the position of
the branch instruction information stored in the content
addressable memory 810 causes:

1. The output of NOR gate 816 to go to logical 0; and
2. [...] RAM 818 to be [...]

[...] partitions 401-403, there will be no logical 1's at the output
[...] logical 1 output which ensures that the corresponding
transmission [...]
word 104 in the fetched sequence is the first instruction
(instruction 12) to be executed.

Now, reference to FIG. 1 shows instruction 12
(having address 01100) to be a non-conditional jump.
Accordingly, every time execution proceeds to this
instruction, the jump is taken and provided instruction 12 is
stored in the branch target buffer, the branch prediction
circuitry will provide a logical 1 on one output line
515₁-515ₙ of prediction processing store circuitry 514 in the
first partition 401. It will be appreciated that the fact of
taking branch instruction 12 means that execution will not
directly proceed to instructions 13, 14 or 15. As a result the
predicted outcome of branch instruction 14 is irrelevant
because that instruction will not be executed in sequence
with instruction 12. The branch target buffer takes this into
account because the NOR gate 516 of the first partition
provides a logical 0 output over its output line 517 to render
non-conductive the transmission gates 530, 531, thus
isolating the second-fourth partitions 402, 403, 404. The second
set of transmission gates 530 prevents the output of target
data from the data RAMs of the other partitions and, as will
later be described herein, the first transmission gate 531
prevents the prediction strength of predicted unexecuted
branch instructions from being updated. Although this
prioritising feature has been described for the unconditional
branch instruction 12, it should be noted that the branch
target buffer is not aware that 12 is different to any other
predicted-taken branch. The buffer will therefore treat any
other predicted-taken branch in the same way, i.e. act to
exclude any output for later branches in the same word.

Now assume that execution of instruction word 102 is to
commence from the first instruction of
that word (instruction 4 of FIG. 1, having address "00100").
Assume for the purpose of this example that instruction 5
(00101) is not predicted as taken and that instruction 7
(00111) is predicted taken. The instruction word address is
input over instruction word address input bus 500, which
will produce logical 1 CAM "hits" in the second and fourth
partitions 402, 404. The byte address "00" is input over
instruction byte address input bus 501 to the selection
circuitry 512, 612, 712, 812 of the partitions. As mentioned
above, the "00" address causes each of the selection circuitry
to pass hits from input to output to provide corresponding
logical 1's to the prediction processing circuitry. In the
presently described example, such logical 1's occur in the
second and fourth partitions 402 and 404. In the second
partition 402, the relevant register of the prediction
processing circuitry 614 stores a "not taken" prediction, so that
all of the outputs of the prediction processing circuitry 614
remain logical 0 indicating no branch predicted taken. By
contrast, the prediction processing circuitry 814 stores a
'taken' prediction in the position corresponding to word 102
and the application of a logical 1 'hit' provides a jump taken
output on one of the lines 815. This:

a) provides a logical 0 at the output of the associated NOR
gate 816; and
b) causes the data RAM 818 of the fourth partition to
output a target address of "00000". As none of the
"lower" partitions has a non-conductive transmission
gate, this target address is passed through to the target
address output bus 503.

As a result it can be seen that the branch target buffer
described above identifies only the first predicted-taken
branch instruction of a sequence, which is not excluded for
execution by being prior to the first instruction of the
sequence to be executed, and, more specifically, the target
address of that instruction.

As previously discussed with respect to FIG. 4, each
partition of the branch target buffer includes circuitry 550
known herein as prediction strength processing circuitry for
storing information based on the history of the branch
instructions identified in the corresponding partition. As
previously noted, FIG. 4 represents an exemplary partition
401 and the prediction strength processing circuitry 550 has
counterparts 650, 750, 850 in the other partitions.

Prediction strength processing circuitry 550 receives
inputs from each of the selection circuitry output lines
513₁-513ₙ. A logical 1 will occur on one of those output
lines 513₁-513ₙ when a match occurs between the word
address input on bus 500 and data indicating the presence of
a branch instruction stored in one of the CAM cells
510₁-510ₙ, provided the selector circuitry 512 has not
disabled the partition because the instruction in that partition
is prior to a first executed instruction of the relevant
sequence. Each of the inputs provides a first input to a
respective AND gate 570₁-570ₙ. The other input to each
AND gate is derived from a respective entry 571₁-571ₙ of
a store 571 in the prediction strength processing circuitry
which stores information indicative of whether there is
associated with the corresponding instruction a so-called
"weak" prediction or a so-called "strong" prediction. In the
present embodiment, a weak prediction is represented by a
logical 1 stored in the corresponding stage, and a strong
prediction is a logical 0 stored in the corresponding stage.

A strong prediction indicates a high degree of confidence
that the presently stored prediction is correct, and is
therefore unlikely to change, whereas a weak prediction
indicates a lower degree of confidence in the correctness of
the present prediction, and a greater likelihood of change.

The lines 513₁-513ₙ also form one input to respective
AND gates 572₁-572ₙ. The other inputs to the AND gates
572₁-572ₙ are provided by the update enable line 502,
which is common to all those gates. The output of each AND
gate 572₁-572ₙ controls a respective pull down transistor
573₁-573ₙ. The pull down transistors 573₁-573ₙ are
connected to a corresponding entry 571₁-571ₙ of the prediction
strength store 571. The output of the AND gates 570₁-570ₙ
form inputs to an n-input OR gate 574 having an output
which includes an output prediction strength line for each
partition.

In operation, when a logical 1 appears on one of the input
lines 513₁-513ₙ (indicating that a branch instruction has
been identified in the partition) the logical 1 is applied to one of
the AND gates 570₁-570ₙ. The AND gate will produce
a logical 1 output if the corresponding entry of the
prediction strength store 571 stores a logical 1 corresponding
to a "weak" prediction. If so, then one of the
inputs to the OR gate 574 will be at logical 1 and the output
line 504 has a logical 1 indicating the prediction is weak.
The logical 1 on the one of the input lines 513₁-513ₙ is also
applied to one input of a respective one of the AND gates
572₁-572ₙ.

The output of the AND gate 572₁-572ₙ will be at logical
0 unless the update enable line 502 is at logical 1 which
causes an automatic update of prediction strength to an
interim new value. In this event one of the AND gates
572₁-572ₙ will have a logical 1 output, which causes the
associated one of the pull down transistors 573₁-573ₙ to turn
on, pulling the corresponding stage 571₁-571ₙ of the
prediction strength store 571 to logical 0, thus causing the
prediction value stored in that stage to change to logical 0
indicating a strong prediction or remain at logical 0 if it was
already in that state.

As previously noted, there is a respective transmission
gate 531, 631 etc connected in the update enable line 502
between each partition and the next higher partition. It will
be recalled that this transmission gate is conductive unless
the relevant partition has identified that a branch instruction
is predicted taken. In that event, the transmission gate is
rendered nonconductive during the operating cycle by the
respective line 517, 617, 717, 817 going to logical 0. At the
start of each cycle, the transmission gates are conductive,
and a logical zero is applied via the line 502 to all partitions
as a precharge level. The logical zero is then disconnected
but the line remains at that level. Once transmission gates
531, 631 have gone non-conductive in partitions where a
branch is predicted taken, a logical one is applied to the line
502. The consequence of this is that a logical 1 on the update
enable line is input to each partition in ascending order up
to and including any partition in which a branch instruction
is predicted as taken. Later partitions, regardless of whether
or not they contain predicted-taken branch instructions do
not receive the logical one level needed to update the
prediction and thus are not automatically updated.

If during a fetch operation an instruction word is
recognised as having plural branch instructions, it is desirable that
the above-discussed automatic updating take place for all
those branch instructions which are not excluded from
execution by virtue only of the initial execution point within
the word, up to and including the first predicted-taken
branch. As an example, if all of the partitions 401-404 stored
a jump instruction for a particular instruction word and if
execution were to start from the branch instruction in
partition 402 (predicted not-taken) and the instruction in
partition 403 were predicted taken, no update should be
performed on the predictions stored in partition 401
(because this instruction could not be executed) and no
update should be performed upon the instruction represented
by partition 404 (because the instruction represented by
partition 403 is predicted as taken, thus pre-empting any
judgements on the instruction represented by partition 404).
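
The intended outcome of the automatic strength update across
partitions can be sketched in Python as below. This is a behavioural
model under the assumption of one relevant entry per partition; the
function name auto_updated_partitions and the zero-based partition
indices are hypothetical and are not taken from the description.

```python
from typing import List


def auto_updated_partitions(
    cam_hit: List[bool],          # per partition: CAM hit for the fetched word
    predicted_taken: List[bool],  # per partition: stored prediction for that hit
    byte_address: int,            # byte position at which execution starts
) -> List[int]:
    """Return zero-based indices of partitions whose strength is auto-updated."""
    updated = []
    for index, (hit, taken) in enumerate(zip(cam_hit, predicted_taken)):
        # The selection circuitry passes a hit only if the partition is
        # not before the starting byte position.
        passed = hit and index >= byte_address
        if passed:
            updated.append(index)   # update enable reaches this partition
        if passed and taken:
            break                   # transmission gate isolates later partitions
    return updated
```

For the example in the text (jumps stored in all four partitions,
execution starting in partition 402, partition 403 predicted taken),
this model updates only the second and third partitions.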
This update strategy is provided for in the branch target
buffer in the following way:

Since in this example the byte address input over byte
address input bus 501 causes the selection circuitry 512 to
disable the first partition 401, there will be no logical 1's on
any of the lines 513. Thus no logical ones will be applied to
the prediction strength circuitry 550 and the output of the
NOR gate 516 remains at logical 1. Gates 530, 531 remain
conductive. Although there will be a CAM hit in the second
partition 402, and the selection circuitry 612 allows the
corresponding logical 1 to be applied to the prediction
processing circuitry 614, the 'not-taken' prediction has the
consequence of providing all logical 0's at the lines 615.
Thus NOR gate 616 has a logical 1 output and gates 630, 631
are conductive. The appearance of a predicted-taken result in
the third partition 403 causes the NOR gate 716 of that
partition to provide a logical 0 output, thus rendering
non-conductive the transmission gate 731. Hence the update line
502 is connected to partitions 401-403, and disconnected
from partition 404. No updating in partition 401 occurs
because a logical 1 is required on one of the lines 513 for this
to occur.

The prediction strength information, as mentioned above,
provides certain information on the history of identified
branch instructions.

Since the prediction of the outcome of a branch, i.e.
whether or not a jump instruction is taken, is required at the
time of fetching, rather than executing, the branch
instruction, it is necessary to update the prediction for use
the next time that particular jump instruction is fetched,
depending on whether or not the prediction currently being
made is found to be correct or not during actual execution.
The strategy adopted in the presently-described embodiment
is to update automatically to an interim new prediction on
the assumption that the present prediction is, in fact, correct.
If the interim prediction is found to have been correct when
the branch is resolved, then the interim prediction is retained
as the new prediction for the next execution, referred to
herein as the resolved prediction; only if the present
prediction is found to have been incorrect is there a need for
correction and a corrected 'resolved prediction' is stored.
The branch target buffer defaults to a state in which no
branch is predicted taken for any cycle in which corrections
are being undertaken. This enables the branch target buffer
to be implemented as a single-port device, the single port
being alternatively used for reading out of predictions and
writing in of predictions.

Referring to FIGS. 6 and 7, the process of updating the
prediction and prediction strength will be described. Both
FIGS. 6 and 7 show the history of a branch instruction which
is entered into the branch target buffer during an entry cycle
(E). For this explanation, it is assumed that each time the
branch instruction is fetched, the previous execution of the
instruction has been resolved. In practice, it may be possible
for the instruction to be fetched again before a previous
execution has been completed, as will be later described
herein. On entry, the prediction and prediction-strength for
any newly-entered jump instruction is "weakly-taken" (wT).
The prediction and strength are stored as a logical 1
(indicating "taken") in the prediction store 563, 663, 763,
863 and as logical 1 (indicating "weak") in the prediction
strength store 571 etc. to form the prediction for that
instruction when it is executed next time.

In both FIGS. 6 and 7, cycle C₁ represents the next cycle
in which the presently-considered instruction is fetched, and
the resolved prediction for the previous cycle represents the
prediction state at the time of next fetching of the
instruction, i.e. the start of the cycle C₁. Similarly, C₂
represents the second occasion at which the branch
instruction is fetched, and the resolved prediction for cycle C₁
represents the predicted value for cycle C₂ and so on. In
each cycle, the prediction is updated to provide an interim
update by assuming that the prediction at the start of the
cycle was correct. If execution of the instruction shows that
the prediction was in fact wrong, the interim update is
discarded, and a new prediction made. This new prediction
is based upon the previous prediction, at the start of the
cycle, and the knowledge gained during resolution.

In the presently-described embodiment, a branch
instruction is put in the BTB only when it is executed to cause a
jump from the normal sequence. Thus at the end of the entry
cycle, after resolution, the history of the jump instruction is
"T" (Taken once).

Referring to FIG. 6 the progress of a jump instruction
which is initially correctly-predicted will be described:

At the start of cycle C₁ the prediction is "weakly taken"
and it is assumed by the branch target buffer that the jump
will in fact be taken. Thus the prediction state is updated by
the update enable line 502 and the automatic update circuitry
572, 573 etc to "strongly-taken". In this case, the jump is
taken. There is thus no need to correct the interim prediction,
which becomes the resolved prediction at the end of the
cycle which includes fetching and executing the instruction.

At the start of cycle C₂, the prediction is "strongly taken",
and the jump is resolved as "taken". The interim update
prediction remains strongly taken and, as the jump is
resolved as being taken, the resolved prediction is likewise
"strongly taken".

However, if in cycle C₃ the prediction is incorrect, in that
the jump is resolved as being not taken then the following
applies:

At the start of cycle C₃ the resolved prediction was
"strongly taken" and the interim update is thus "strongly
taken". However, as the jump is resolved as "not-taken" the
prediction requires updating to "weakly taken" as shown.
Thus it will be seen that the strength, not the prediction, is
changed.

For the next cycle, C₄, the prediction value is still "taken",
although "weakly-taken". Thus the interim update will be
from "weakly-taken" to "strongly-taken", on the assumption
that the predicted behaviour is correct. If however once
again the jump is resolved as being not-taken, the resolved
prediction must be corrected to "weakly-not-taken", in other
words the correction between the resolved prediction of
cycle C₃ to the resolved prediction of cycle C₄ requires the
prediction to be changed rather than the strength to be
changed, i.e. from "weakly-taken" to "weakly-not-taken".
This change is made by an associative look-up in the present
embodiment.

Finally at the start of cycle C₅ the prediction value is
"weakly not taken", and accordingly the "interim update" is
to "strongly not taken". If the jump is resolved as "not taken"
the resolved prediction is "strongly not taken" whereas if the
jump is resolved as taken, then the resolved prediction
would be "weakly taken".

Turning to FIG. 7, the progress of a jump instruction is
shown, in which the first resolution of the jump instruction
after entry into the branch target buffer is incorrect. Thus, in
cycle C₁ the interim update, assuming that the prediction
value of "weakly taken" is correct, is to "strongly-taken".
However, the jump is resolved as "not-taken". As a result,
the resolved prediction is "weakly-not-taken", in other
words requiring the prediction at the start of the cycle, which
was "weakly taken", to be corrected to "weakly not taken".
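
The update rule walked through for FIGS. 6 and 7 amounts to a small
four-state machine. The Python sketch below is one way to express it,
using the encoding stated in the text (logical 1 for "taken" and
logical 1 for "weak"); the names State, ENTRY_STATE, interim_update and
resolve are hypothetical and chosen only for this illustration.

```python
from typing import Tuple

State = Tuple[bool, bool]  # (taken, weak): each is True when stored as logical 1

ENTRY_STATE: State = (True, True)  # "weakly taken" on entry


def interim_update(state: State) -> State:
    """Automatic update at fetch: assume the prediction is correct."""
    taken, _ = state
    return (taken, False)  # strength bit pulled to logical 0 ("strong")


def resolve(start: State, jump_taken: bool) -> State:
    """Resolved prediction at the end of a cycle that began in `start`."""
    taken, weak = start
    if jump_taken == taken:
        return interim_update(start)   # prediction correct: interim update kept
    if not weak:
        return (taken, True)           # strong and wrong: only the strength changes
    return (not taken, True)           # weak and wrong: the prediction changes
```

Starting from ENTRY_STATE and resolving as taken, taken, not taken,
not taken follows the FIG. 6 sequence: strongly taken, strongly taken,
weakly taken, weakly not taken.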
At the start of cycle C₂ the "weakly not taken" prediction
gives an interim update of "strongly not taken", and the
resolution of the jump as "not taken" confirms this value as
the resolved prediction. However, in cycle C₃, although the
interim update is to remain at "strongly not taken", the
resolution is "taken" thus the resolved prediction at the end
of cycle C₃ is "weakly not taken", i.e. once again changing
only the strength of the prediction, rather than the prediction
itself. This should be contrasted with cycle C₄, in which the
predicted value of "weakly not taken" at the start of the
cycle is updated to "strongly not taken", but as a result of a
resolution to "taken" is finally corrected to "weakly taken",
in other words a change of prediction value, not of
prediction strength.

The above discussion of the branch target buffer generally
relates to the circuitry in operation of the buffer in the
reading mode. Details of the circuitry for writing
information to an exemplary line of the branch target buffer will now
be described with reference to FIG. 8. FIG. 8 shows an
exemplary content addressable memory cell 510*, being one
of the memory cells 510 in the first partition 401. This
content addressable memory cell 510* for the purpose of
illustration is shown as having four storage locations for
storing a four bit instruction word address. The content
addressable memory cell 510* has an output line 511*, upon
which there appears a logical 1 "CAM hit" when an address
input over the instruction word address input bus 500 finds
a match in the CAM cell.

The output line 511* passes to the previously-described
selection circuitry 512 to an output line 513*. In the writing
mode, input 912 changes state from the reading mode so that
the selection circuitry 512 provides a connection between its
input and output where the partition is to be written to, but
for all other partitions, the circuit 512 provides no
connection between input and output and those outputs are held to
ground. The content addressable memory cells are not only
connected to instruction word address input bus 500, but
also to instruction word write bus 900. This bus is connected
to each of the storage locations of the content addressable
memory. Each of the storage locations of the content
addressable memory 510* is also connected to a write enable line
901* which receives a logical 1 when the address currently
appearing on the write bus 900 is to be written to the content
addressable memory cell 510*. A separate write enable line
is provided for each line of each partition. A further control
line 920 connected to all partition lines is provided to allow
a new target address to be written to the RAM 518.

via a second transmission gate 910*. The output of the
second OR gate 906* provides the control input to the
transmission gates 918* and 910*. The second input to the
second OR gate 906* is provided by the output of an AND
gate 911* which receives a first input from the corresponding
output of the selection circuitry 512 and a second input from
an input line 912.

As previously mentioned, the write enable line 901* is
operative to allow an address input over write bus 900 to be
stored in the CAM cell 510*. At all times, one line in each
partition comprising a CAM cell, a line of the target address
RAM and the associated prediction and prediction strength
data is identified as invalid, and it is this invalid line to which
a new branch jump instruction is written. To render the
partition line shown in FIG. 8 invalid, an input is applied to
control input 904* so as to set the latch 903* and render
conductive the pull down transistor 902*. The effect of this
is to pull the CAM output line 511* down to earth. The CAM
output line is then held at earth until a new predicted value
is written into this line and the line is marked valid, as later
described.

To write a new predicted-taken entry to the line shown in
FIG. 8, it is necessary to write information to the content
addressable memory cell 510*, to the target address row
518*, to the prediction store 563* and the prediction strength
store 571*. In the present embodiment the prediction for a
newly-entered branch is "taken" and the prediction strength
is selected to be "weak"; thus the overall prediction level is
"weakly-taken". To accomplish this, the circuit 112 responds
to an entry command from the processor 110 to cause the
first input line 907 to supply a logical 1 and the second input
line 909 to supply a logical 1. When the write enable line
901* goes to logical 1, the output of the second OR gate 906*
goes to logical 1, thus enabling the transmission gates 918*
and 910* and causing the logical 1 on line 907 to be written
to prediction store register stage 563* (representing "taken")
and the logical 1 on second input line 909 to be written to
the prediction strength stage 571* (representing "weak").
The address data presented on the write bus 900 is written
to the CAM cell 510*, as has previously been mentioned. A
logical 1 applied to line 920 ensures that OR gate 590
outputs a logical 1, thus rendering conductive the
transmission gates 530. The presence of the logical 1 on the write
enable line 901*, provided at the input to the first OR gate
905* however provides a logical 1 on the word line 515* to
the RAM line 518* and this allows target address data to be
written into the corresponding line 518* of the RAM 518

The CAM output line 511* is connected to a pull down from a target address input signal 908. 

transistor 902*. operable to pull the CAM output line to To ensure that there is always an empty line available for 

earth. The pull down transistor 902* is controlled by a 50 each partition, every time a new branch is entered into the 

control latch 903* which is selectively set or reset by an branch target buffer a random number generator 113 ranr 

input 904* to mark the line valid or invalid. The write enable domly selects a line in the partition which is to contain the 

line 901* is also connected to one input of a two-input OR . new branch and inputs a bit to the corresponding latch 903 

gate 905* whose other input is provided by the output of the so that the line is marked invalid. The line to which the 

AND gate 564* of the prediction processing circuitry 514 55 branch instruction is written is selected using a stored value 

and the output of the OR gate 905* forms the word line 515* of the previously-selected random number. This technique 

for row b of the RAM 518. The write enable line 901* is requires a fust row decoder 115 to output signal 904 to mark 

further connected to one input of a two-input OR gate 906*. a line as invalid and a second row decoder 114 to output 

Register entry 563* of the prediction store 563 is connected signal 901 to add the new branch. The random number 
:- ttVt.fp^j^Of .SJ&fiS? ..^^^A'V-^^l^y? 1 ^? rDCW * 60 .g cnc p l0 , r ' U 3 is conftgurcjltp ensure that^the same nurjober 

prediction to be written in" 'from 'the* circuit 102" via* a" is not generated rwice**s"ucCessiveiy.' ' v *" ** 

transmission gate 918*. The output of the prediction store FIG. 9 shows schematically a computer system including 

563* is coupled via an OR gate 565* to the AND gate 564* the branch target buffer 400 already described. A processor 

The second input of the OR gate 565* is provided by the 110 is arranged to fetch instructions from a memory 111 and 

control-line 920. Prediction strength storage location571* is 65 execute a plurality of instructions in a pipelined process. The 

connected to an input line 909 allowing a new prediction fetch address is output on line 120 and this full instruction 

strength to be written in from state transition circuitry 112. address is supplied to the memory 111 and at the same time 
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[...] to inputs 500 and 501 of buffer 400. The fetched instructions are supplied on line 121 to the processor 110 as previously described. At an appropriate stage in the processing pipeline, the processor 110 will resolve each branch instruction to determine whether it is in fact taken. If the processor had received from buffer 400 a correct prediction for the branch instruction then no write operation is required. If however no correct prediction had been received (due to either absence of the branch instruction in the buffer, or a misprediction by the buffer) then the processor outputs signals to the circuit 112 to cause a write. The output on lines 122 comprises the result of resolution, the prediction for that branch instruction, the appropriate target address and branch address for the instruction. These are input to the state transition circuitry 112 which provides outputs 900, 907, 908, 909, 912, 920 and 502 to the branch target buffer 400 and an output on line 123 to operate the random number generator 113. The state transition circuitry is responsive to the resolution and prediction received on line 122 to change prediction or strength of prediction values in accordance with the resolution of the branch instruction by the processor 110 as has previously been described with reference to FIGS. 6 and 7. The updated prediction and strength of prediction values are input on lines 907 and 909 to the appropriate entry in the buffer 400, together with the other data to be written, including the target address on line 908, as already described with reference to FIG. 8. In this case the output bus 900 outputs the full instruction address of the instruction for the write operation into the buffer and line 900 forms a second [...] 912 is set to logic 1 [...] the multiplexor 116 selects [...] on lines 500 and 501 to carry out an associate operation on the CAM of the partition denoted by 501 in parallel with the write operation caused by an input on line 900 directly into the buffer 400. The input 900 which is received directly by the buffer 400 inputs only the word address bits of the [...] fed to the CAM [...] the multiplexor 116 has a first input 124 from the address line 120, a second input 125 from line 900 [...] content of the buffer without distinguishing whether the error was due to a misprediction or to the absence of the branch instruction in the buffer. [...] performs the same remedial action in both situations as follows:

1. Input the instruction address to the branch target buffer [...] match. At the same time [...] the branch target buffer [...] information as if the error had occurred due to absence [...] the prediction [...] to be written. [...] a logical 1 input to AND [...] occurs for that partition row. [...] consequently OR gate [...] RAM. In addition, line 920 causes circuit 506 to put the [...] that row of the RAM.

If the associative match creates a hit, this indicates that the branch had already been located in the branch target buffer. As a result, the (supposedly new) branch entry is terminated by leaving the relevant line as invalid.

If the associative match resulted in no hit, then the cause of the error was a genuinely new branch and the line [...] output 923 of an OR [...] input to latch 903* so [...] for the new line selected by signal [...] thereby marking that line as valid. Signal 903 causes the latch to respond to the signal on line 904* to mark a new line as invalid.

If a wrong prediction has been made, then the automatic update properties of the described branch target buffer may result in a wrong value of interim update of prediction being stored. Reference should be made to FIGS. 6 and 7 and the corresponding description. It will be seen from this that if the original prediction were "weakly taken" or "weakly not taken" and the resolution of the branch was in the opposite direction, i.e. "not taken" or "taken", then a new value of prediction must be written to the prediction store 563. If on the other hand the originally-stored prediction was "strongly taken" or "strongly not taken" then if the branch were resolved in the opposite direction the strength bit must be re-written. Accordingly [...] 912 [...] permit writing of a new prediction, and [...] bit [...] corresponding line of the prediction store 563 and the prediction strength store 571. [...] re-writing of the value of prediction, or prediction strength, the write enable line 901* is kept at a logical 0, thus preventing writing of a new address into the CAM cell, or a new address into the RAM 518, and instead the third [...] cell 510*) provides a logical 1 to the second input of the AND gate 911* and a consequent logical 1 on the output of the second OR gate 906*. This causes the transmission gates 918* and 910* to be rendered conductive, which allows the new prediction value to be input over line 907 and the new prediction strength value to be input over line 909.

In the embodiment described with respect to FIG. 3, no prediction is performed in cycles where a write is being [...] the above example [...] preventing a double entry in the buffer for the [...] understood that due to the [...] instruction first being fetched by the processor and found absent from the buffer, and the final resolution of the branch instruction by the processor which would then result in an output through the state transition circuitry 112 to cause a write into the buffer. Although it is unlikely, it is possible that the same branch instruction could be fetched a second time by the processor before the first execution of the branch instruction has been resolved and caused a write entry into the buffer. It is therefore important to avoid the possibility of writing the same branch instruction into the buffer at a different location when the instruction was not located in the buffer the first time it was fetched by the processor. It is for this reason that an associative match on the branch address is carried out at input 500 in parallel with the write input. The address on bus 900 is used simultaneously for both operations and while this parallel operation is carried out the line being written to is held invalid. If that associate operation finds a hit then the new entry line is maintained as invalid and a line on which the hit occurred has a modification of the prediction strength and target which have been
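The double-entry avoidance described on this page (write to a line that is held invalid, run the associative match in parallel, and validate the new line only on a miss) can be illustrated with a small software model. This is a minimal sketch and not the patented circuit: the names `Line`, `Partition`, `associate`, `next_victim` and `write_branch` are hypothetical, the four-line partition size is arbitrary, and the handling of a hit (only the target address is refreshed) is a simplifying assumption because the text on the updated fields is incomplete here.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Line:
    valid: bool = False   # models the valid/invalid latch of one line
    address: int = 0      # CAM contents (branch address)
    target: int = 0       # RAM contents (target address)
    taken: bool = True    # prediction value
    strong: bool = False  # prediction strength


class Partition:
    """Hypothetical software model of one buffer partition."""

    def __init__(self, n_lines: int = 4, seed: int = 0) -> None:
        self.lines: List[Line] = [Line() for _ in range(n_lines)]
        self.rng = random.Random(seed)
        # Stored value of the previously selected random number:
        # the line that is kept empty for the next new branch.
        self.free = 0

    def associate(self, address: int) -> Optional[int]:
        """Associative match: only valid lines can produce a hit."""
        for i, line in enumerate(self.lines):
            if line.valid and line.address == address:
                return i
        return None

    def next_victim(self) -> int:
        """Pick the next line to invalidate, never the same number twice running."""
        candidates = [i for i in range(len(self.lines)) if i != self.free]
        return self.rng.choice(candidates)

    def write_branch(self, address: int, target: int) -> int:
        new = self.free
        entry = self.lines[new]
        entry.valid = False                      # held invalid during the write
        entry.address, entry.target = address, target
        entry.taken, entry.strong = True, False  # new entries are weakly taken
        hit = self.associate(address)            # match made alongside the write
        if hit is not None:
            # Branch already present: the new line stays invalid.
            # ASSUMPTION: only the target is refreshed on the hit line.
            self.lines[hit].target = target
            return hit
        entry.valid = True                       # genuinely new branch
        victim = self.next_victim()
        self.lines[victim].valid = False         # keep one line empty
        self.free = victim
        return new


btb = Partition(n_lines=4, seed=1)
first = btb.write_branch(0x10, 0x100)
second = btb.write_branch(0x10, 0x200)  # same branch written again
```

Because the new line is invalid while the match runs, it cannot match itself, so a second write of the same branch address lands on the existing entry instead of creating a synonym.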




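The prediction update rule stated above (a misprediction from a weak state changes the prediction value, a misprediction from a strong state changes only the strength bit) can be written as a small function. This is a sketch under assumptions: the function name `update_prediction` and the tuple encoding are hypothetical, and the behaviour on a correct prediction (the entry becomes strong) is an assumption, since that case is covered by the description of FIGS. 6 and 7 rather than by this passage.

```python
from typing import Tuple

# A newly entered branch is "weakly taken": (taken=True, strong=False).
NEW_ENTRY: Tuple[bool, bool] = (True, False)


def update_prediction(taken: bool, strong: bool,
                      resolved_taken: bool) -> Tuple[bool, bool]:
    """Return the (taken, strong) pair to store after a branch is resolved."""
    if resolved_taken == taken:
        # ASSUMPTION: a correct prediction strengthens the entry.
        return taken, True
    if strong:
        # Strongly predicted, resolved the other way:
        # only the strength bit is re-written.
        return taken, False
    # Weakly predicted, resolved the other way:
    # the prediction value changes, the strength stays weak.
    return resolved_taken, False


after_one_miss = update_prediction(*NEW_ENTRY, resolved_taken=False)
```

Starting from a strong state, two successive mispredictions are needed before the stored prediction value itself flips.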